A Graph Based Method for Building Multilingual Weakly Supervised Dependency Parsers

نویسندگان

  • Jagadeesh Gorla
  • Anil Kumar Singh
  • Rajeev Sangal
  • Karthik Gali
  • Samar Husain
  • Sriram Venkatapathy
چکیده

The structure of a sentence can be seen as a spanning tree in a linguistically augmented graph of syntactic nodes. This paper presents an approach for unlabeled dependency parsing based on this view. The first step involves marking the chunks and the chunk heads of a given sentence and then identifying the intra-chunk dependency relations. The second step involves learning to identify the inter-chunk dependency relations. For this, we use an initialization technique based on a measure we call Normalized Conditional Mutual Information (NCMI), in addition to a few linguistic constraints. We present the results for Hindi. We have achieved a precision of 80.83% for sentences of size less than 10 words and 66.71% overall. This is significantly better than the baseline in which random initialization is used.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Parse Imputation for Dependency Annotations

Syntactic annotation is a hard task, but it can be made easier by allowing annotators flexibility to leave aspects of a sentence underspecified. Unfortunately, partial annotations are not typically directly usable for training parsers. We describe a method for imputing missing dependencies from sentences that have been partially annotated using the Graph Fragment Language, such that a standard ...

متن کامل

Does it have to be trees?: data-driven dependency parsing with incomplete and noisy training data

We present a novel approach to training data-driven dependency parsers on incomplete annotations. Our parsers are simple modifications of two well-known dependency parsers, the transition-based Malt parser and the graph-based MST parser. While previous work on parsing with incomplete data has typically couched the task in frameworks of unsupervised or semi-supervised machine learning, we essent...

متن کامل

Effective Greedy Inference for Graph-based Non-Projective Dependency Parsing

Exact inference in high-order graph-based non-projective dependency parsing is intractable. Hence, sophisticated approximation techniques based on algorithms such as belief propagation and dual decomposition have been employed. In contrast, we propose a simple greedy search approximation for this problem which is very intuitive and easy to implement. We implement the algorithm within the second...

متن کامل

Data-Driven Dependency Parsing of New Languages Using Incomplete and Noisy Training Data

We present a simple but very effective approach to identifying high-quality data in noisy data sets for structured problems like parsing, by greedily exploiting partial structures. We analyze our approach in an annotation projection framework for dependency trees, and show how dependency parsers from two different paradigms (graph-based and transition-based) can be trained on the resulting tree...

متن کامل

Experiments in Newswire-to-Law Adaptation of Graph-Based Dependency Parsers

We evaluate two very different methods for domain adaptation of graph-based dependency parsers on the EVALITA 2011 Domain Adaptation data, namely instance-weighting [10] and self-training [9, 6]. Since the source and target domains (newswire and law, respectively) were very similar, instance-weighting was unlikely to be efficient, but some of the semi-supervised approaches led to significant im...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008